Unknown Metamorphic Malware Detection: Modelling with Fewer Relevant Features and Robust Feature Selection Techniques
Authors
Abstract
Detecting metamorphic malware is challenging because the internal code structure varies widely between generations. Code morphing and obfuscation reshape malware code without compromising its maliciousness, so signature-based scanners fail to detect metamorphic malware. Prior research on metamorphic malware detection relies on similarity matching techniques. This work focuses on the development of a statistical scanner for metamorphic virus detection by employing feature ranking methods such as Term Frequency-Inverse Document Frequency (TF-IDF), Term Frequency-Inverse Document Frequency-Class Frequency (TF-IDF-CF), Categorical Proportional Distance (CPD), Galavotti-Sebastiani-Simi Coefficient (GSS), Weight of Evidence of Text (WET), Term Significance (TS), Odds Ratio (OR), Weighted Odds Ratio (WOR), MultiClass Odds Ratio (MOR), Comprehensive Measurement Feature Selection (CMFS), and Accuracy2 (ACC2). Malware and benign models for classification are built from the top-ranked features obtained with each individual feature selection method. The proposed statistical detector detects Metamorphic Worm (MWORM) samples and viruses generated with the Next Generation Virus Construction Kit (NGVCK) with 100% accuracy and precision. Further, the relevance of the feature ranking methods at varying feature lengths is determined using the McNemar test. Thus, the designed non-signature-based scanner can detect sophisticated metamorphic malware and can be used to support current antivirus products.
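As a rough illustration of one of the feature ranking methods named in the abstract, the sketch below scores features by a smoothed log Odds Ratio and keeps the top-ranked ones. It is a minimal sketch and not the authors' implementation: the abstract does not say which features are used, so the opcode-like tokens, the toy samples, the smoothing constant `eps`, and the helper names `odds_ratio_scores` and `top_k` are all assumptions made for illustration.

```python
import math


def odds_ratio_scores(malware_docs, benign_docs, eps=0.5):
    """Score each feature by the log odds ratio of its presence in
    malware samples versus benign samples.

    Each sample is a set of features (hypothetical opcode tokens here).
    eps is additive smoothing, an assumption that avoids division by zero.
    """
    vocab = set().union(*malware_docs, *benign_docs)
    n_mal, n_ben = len(malware_docs), len(benign_docs)
    scores = {}
    for term in vocab:
        mal_hits = sum(term in doc for doc in malware_docs)
        ben_hits = sum(term in doc for doc in benign_docs)
        # Smoothed probability that the feature occurs in each class.
        p_mal = (mal_hits + eps) / (n_mal + 2 * eps)
        p_ben = (ben_hits + eps) / (n_ben + 2 * eps)
        scores[term] = math.log((p_mal * (1 - p_ben)) / ((1 - p_mal) * p_ben))
    return scores


def top_k(scores, k):
    """Return the k highest-scoring features (ties broken alphabetically)."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Toy data, invented for illustration only.
malware = [{"mov", "xor", "jmp"}, {"xor", "jmp", "push"}, {"xor", "call"}]
benign = [{"mov", "push", "call"}, {"mov", "call"}, {"push", "call", "jmp"}]

scores = odds_ratio_scores(malware, benign)
print(top_k(scores, 2))
```

A positive score marks a feature that is more typical of the malware class and a negative score one that is more typical of benign files; a classifier would then be trained on only the top-ranked features, with the cut-off length treated as a parameter to vary.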
Similar resources
Malware Detection using Classification of Variable-Length Sequences
In this paper, a novel graph-based method is proposed to classify variable-length sequences as feature extraction. The proposed method overcomes the problems of the traditional graph with variable-length data, without fixing the length of sequences, by determining the most frequent instructions and inserting the rest of the instructions into the set of “other”, which saves time and memory. Acco...
The Feature Selection and Intrusion Detection Problems
Cyber security is a serious global concern. The potential of cyber terrorism has posed a threat to national security; meanwhile the increasing prevalence of malware and incidents of cyber attacks hinder the utilization of the Internet to its greatest benefit and incur significant economic losses to individuals, enterprises, and public organizations. This paper presents some recent advances in i...
A Parallel Genetic Algorithm Based Method for Feature Subset Selection in Intrusion Detection Systems
Intrusion detection systems are designed to provide security in computer networks, so that if the attacker crosses other security devices, they can detect and prevent the attack process. One of the most essential challenges in designing these systems is the so called curse of dimensionality. Therefore, in order to obtain satisfactory performance in these systems we have to take advantage of app...
Metamorphic Malware Detection using Control Flow Graph Mining
Metamorphic malware propagation has persuaded the security community to consider new approaches to confront this generation of malware with novel solutions. The Control Flow Graph (CFG) has been successful in detecting simple malware. CFG-based detection methods now need to be improved to detect metamorphic malware efficiently. Our approach has improved the simple CFG with benefici...